-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
child_process: fix handleless NODE_HANDLE handling #13235
Conversation
BTW thanks to @cjihrig as he came up with a test case and helped debugging the issue. |
Removed the test as it doesn't look reliable at all. If somebody has a better idea for a test, it would be great. |
lib/internal/child_process.js
Outdated
// It can happen the handle is not received because of some error on | ||
// ancillary data reception such as MSG_CTRUNC. In this case don't try to | ||
// perform any conversion. | ||
if (!handle) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... I understand the reasoning behind this but I'm wondering if this is the right approach. If the handle is invalid at this point, should we treat it as an error condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessarily an error. For example, connections can be closed by the client while they wait in a queue to be sent from the master to the worker. I think the correct thing to do is advocate that people check their received handles in the worker prior to using them (#13196).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as far as I can tell there are two cases where the sending a handle ends up with no handle on reception:
- As Colin said, if the handle is closed while waiting in the sending queue. This case is detected by the sending process and instead of sending a
NODE_HANDLE
message, it sends a normal ipc message. In this case, thesendHandle
argument of themessage
listener callback is set to null. - When for some reason, the handle cannot be duped on the receiving process. This is the case this PR tries to fix and I think it makes sense that it behaves in the same way as the previous case (setting
sendHandle
tonull
).
I think there's another argument not to treat it as an error. We can send data (the message
argument) along with the handle that can be useful on reception even if there's no handle associated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm convinced :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a better approach be to send a "please resend the handle" message to the master instead of NODE_HANDLE_ACK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense. Do you have something in mind on how would that work on the sending process?
I think resending the message automatically can lead to problems if the receiving process constantly fails. Maybe using the process.send()
callback could be a good idea. So if the message sent is a NODE_HANDLE
wait for a NODE_HANDLE_ACK
or NODE_HANDLE_NACK
to call the callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I said "master" when I should have said "parent process."
You are right that unchecked repeated resending is no good. The cluster module could deal with it by sending the handle to another worker after a few turns but that isn't applicable to the child_process module.
I don't have a great solution in mind but as a strawman: try three times and close + log a warning after that? If it happens frequently enough and bothers users enough they will file bug reports about it and we can try to come up with something better.
We could cheat when the ChildProcess instance is managed by the cluster module and do the send-to-another-worker thing. It would be tight coupling but that doesn't bother me personally. Your call.
lib/internal/child_process.js
Outdated
@@ -516,6 +516,12 @@ function setupChannel(target, channel) { | |||
// a message. | |||
target._send({ cmd: 'NODE_HANDLE_ACK' }, null, true); | |||
|
|||
// It can happen the handle is not received because of some error on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you replace "It can happen the handle" with "It is possible that the handle"
a884a07
to
ed6e9fb
Compare
PR updated with @bnoordhuis suggestion. The new commit message would look:
|
lib/internal/child_process.js
Outdated
target._send(target._pendingMessage.message, | ||
target._pendingMessage.handle, | ||
target._pendingMessage.options, | ||
target._pendingMessage. callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Erroneous whitespace
lib/internal/child_process.js
Outdated
process.emitWarning('Handle didn\'t reach the receiving process ' + | ||
'correctly', 'SentHandleNotReceivedWarning'); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified:
if (target._pendingMessage) {
if (message.cmd === 'NODE_HANDLE_ACK') {
closePendingHandle(target);
} else if (target._pendingMessage.retransmissions++ ===
MAX_HANDLE_RETRANSMISSIONS) {
// If the handle didn't reach the receiving process correctly retry up
// to 3 times. If it still fails, close it and print a warning.
closePendingHandle(target);
process.emitWarning('Handle didn\'t reach the receiving process ' +
'correctly', 'SentHandleNotReceivedWarning');
}
}
lib/internal/child_process.js
Outdated
handle: handle, | ||
options: options, | ||
retransmissions: 0 | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could shorten this to:
target._pendingMessage =
{ callback, message, handle, options, retransmissions: 0 };
lib/internal/child_process.js
Outdated
target._pendingHandle.close(); | ||
target._pendingHandle = null; | ||
if ((message.cmd === 'NODE_HANDLE_ACK') || | ||
(message.cmd === 'NODE_HANDLE_NACK')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Superfluous parens.
b4815fb
to
1dda284
Compare
PR updated with comments from @BridgeAR and @bnoordhuis addressed. PTAL. Thanks! |
ping @bnoordhuis @cjihrig :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I remember looking at this last week but I must've forgotten to actually submit my review.
LGTM and sorry for the delay!
lib/internal/child_process.js
Outdated
target._send(target._pendingMessage.message, | ||
target._pendingMessage.handle, | ||
target._pendingMessage.options, | ||
target._pendingMessage.callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put braces around the body for legibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests would be great, but LGTM.
lib/internal/child_process.js
Outdated
} else if (target._pendingMessage.retransmissions++ === | ||
MAX_HANDLE_RETRANSMISSIONS) { | ||
closePendingHandle(target); | ||
process.emitWarning('Handle didn\'t reach the receiving process ' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nit: could you change didn't to did not.
All nits addressed.
I agree, I tried something that worked for me locally but wasn't reliable at all. If anyone has suggestions would be great. |
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. PR-URL: nodejs#13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
Landed in 71ca122. Thanks! |
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. PR-URL: nodejs#13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. PR-URL: #13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. PR-URL: #13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
Landed on |
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. Fixes: #9706 PR-URL: #13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. Fixes: #9706 PR-URL: #13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. Fixes: #9706 PR-URL: #13235 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Colin Ihrig <[email protected]>
It can happen that
recvmsg()
may return an error on ancillary datareception when receiving a
NODE_HANDLE
message (for exampleMSG_CTRUNC
). This would end up, if the handle type wasnet.Socket
,on a
message
event with a non null but invalidsendHandle
. Avoidthis by checking the
handle
validity before performing thehandleConversion
.Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
child_process